Comparative Methods for Gene Structure Prediction in Homologous Sequences
Abstract
The increasing number of sequenced genomes motivates the use of evolutionary patterns to detect genes. We present a series of comparative methods for gene finding in homologous prokaryotic or eukaryotic sequences. Based on a model of legal genes and a similarity measure between genes, we find the pair of legal genes of maximum similarity. We develop methods based on gene models and alignment-based similarity measures of increasing complexity, which take into account many details of real gene structures, e.g. the similarity of the proteins encoded by the exons. When using a similarity measure based on an existing alignment, the methods run in linear time. When integrating the alignment and prediction processes, which allows for more fine-grained similarity measures, the methods run in quadratic time. We evaluate the methods in a series of experiments on synthetic and real sequence data, which show that all methods are competitive but that taking the similarity of the encoded proteins into account substantially boosts performance.
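As a rough illustration of the linear-time case, the sketch below scans a fixed alignment once per reading frame and reports the most similar pair of legal genes. It is a toy under stated assumptions, not the paper's method: the alignment is assumed gapless (two equal-length strings), a legal gene is assumed to be a simple prokaryotic open reading frame (ATG to the first in-frame stop, at the same columns in both sequences), and similarity is assumed to be a plain match/mismatch count. The name `best_gene_pair` and its parameters are hypothetical.

```python
# Toy sketch: best pair of co-located ORFs under a fixed gapless alignment.
# Assumptions (not from the source): gapless alignment, prokaryotic ORF model,
# match/mismatch column scoring.
STOPS = {"TAA", "TAG", "TGA"}


def best_gene_pair(a, b, match=1, mismatch=-1):
    """Return (score, start, end) of the best legal gene pair, or None.

    The gene occupies columns [start, end) in both sequences, begins with
    ATG in both, ends with a stop codon in both, and contains no earlier
    in-frame stop codon in either sequence.
    """
    if len(a) != len(b):
        raise ValueError("a gapless alignment needs equal-length sequences")
    best = None
    for frame in range(3):
        score = 0          # running score of all codon columns seen so far
        open_start = None  # (running score at start, start index), lowest score
        for i in range(frame, len(a) - 2, 3):
            ca, cb = a[i:i + 3], b[i:i + 3]
            # A shared ATG is a candidate start; keep the one that maximises
            # the eventual difference score_at_end - score_at_start.
            if ca == "ATG" and cb == "ATG":
                if open_start is None or score < open_start[0]:
                    open_start = (score, i)
            score += sum(match if x == y else mismatch for x, y in zip(ca, cb))
            if ca in STOPS or cb in STOPS:
                # Only a stop shared by both sequences closes a legal pair.
                if ca in STOPS and cb in STOPS and open_start is not None:
                    cand = (score - open_start[0], open_start[1], i + 3)
                    if best is None or cand[0] > best[0]:
                        best = cand
                # Any in-frame stop invalidates the currently open starts.
                open_start = None
    return best


print(best_gene_pair("CCATGAAACCCTAAGG", "CCATGAAGCCCTAAGG"))
```

Each column is visited once per frame, so the running time is linear in the alignment length. Handling gaps, introns, or protein-level similarity would need a richer gene model and scoring scheme than this sketch provides.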
Similar resources
Applications of hidden Markov models for comparative gene structure prediction
Identifying the structure in genome sequences is one of the principal challenges in modern molecular biology, and comparative genomics offers a powerful tool. In this paper we introduce a hidden Markov model that allows a comparative analysis of multiple sequences related by a phylogenetic tree. The model integrates structure prediction methods for one sequence, statistical multiple alignment m...
Gene Family: Structure, Organization and Evolution
Gene families are groups of homologous genes that share very similar sequences and may have identical functions. Members of gene families may be found in tandem repeats or interspersed through the genome. These sequences are copies of ancestral genes that have undergone changes. The multiple copies of each gene in a family were constructed based on gene duplicati...
Applications of Hidden Markov Models for Characterization of Homologous DNA Sequences with a Common Gene
Identifying and characterizing the structure in genome sequences is one of the principal challenges in modern molecular biology, and comparative genomics offers a powerful tool. In this paper, we introduce a hidden Markov model that allows a comparative analysis of multiple sequences related by a phylogenetic tree, and we present an efficient method for estimating the parameters of the model. T...
Sequencing and comparative analysis of flagellin genes fliA and fliB in bovine Clostridium chauvoei isolates
Background: Clostridium chauvoei is the etiological agent of blackleg, an endogenous infection in cattle. Flagella are known to play a critical role in the protective immunity of animals to clostridial infections. C. chauvoei has two copies of the fliC gene, namely fliA and fliB. Objectives: The aim of this study was the determination and nucleotide sequence analysis of both copies of fliC ...
Prediction of 3D protein Structure based on Mutation of AKAP3 and PLOD3 Gene in Case of Non-Obstructive Azoospermia
Background: The present study was designed to evaluate A-kinase anchoring protein 3 (AKAP3) and Procollagen-Lysine, 2-Oxoglutarate 5-Dioxygenase 3 (PLOD3) gene mutations and to predict 3D protein structure for ligand binding activity in cases of non-obstructive azoospermic males. Materials and Methods: Clinically diagnosed cases of non-obstructive azoos...